A declarative extension of parsing expression grammars for recognizing most programming languages

Authors

  • Tetsuro Matsumura
  • Kimio Kuramitsu
Abstract

Parsing expression grammars (PEGs) are a popular foundation for describing syntax. Unfortunately, several syntactic constructs of programming languages are still hard to recognize with pure PEGs. Notorious cases include typedef-defined names in C/C++, indentation-based code layout in Python, and here documents in many scripting languages. To recognize such PEG-hard syntax, we have developed a declarative extension to PEGs. "Declarative" here means that the extension uses no programmed semantic actions, which are traditionally used to realize extended parsing behavior. Nez is our extended PEG language, which includes symbol tables and conditional parsing. This paper demonstrates that the Nez extensions can recognize many practical programming languages, such as C, C#, Ruby, and Python, that involve PEG-hard syntax.
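To make the symbol-table idea concrete, here is a minimal sketch of recognizing a here document, where the closing delimiter is only known after the opening line has been read. It is written in plain Python and does not use Nez syntax or the Nez API; `SymbolTable`, `match_heredoc`, and the table name `"delim"` are hypothetical names chosen for illustration, and the `<<DELIM` form is a simplified assumption about here-document syntax.

```python
import re

# Hypothetical helper: a stack-like table of (table_name, text) entries
# that can be rolled back when a match attempt fails (PEG-style backtracking).
class SymbolTable:
    def __init__(self):
        self._entries = []

    def define(self, table, text):
        self._entries.append((table, text))

    def mark(self):
        return len(self._entries)

    def rollback(self, mark):
        del self._entries[mark:]

    def latest(self, table):
        for name, text in reversed(self._entries):
            if name == table:
                return text
        return None


_OPEN = re.compile(r"<<([A-Za-z_]\w*)\n")


def match_heredoc(src, pos, symbols):
    """Try to recognize '<<DELIM', body lines, then a line equal to DELIM.

    Returns the position just after the closing delimiter line,
    or None if no here document starts at pos or it is unterminated.
    """
    mark = symbols.mark()
    opening = _OPEN.match(src, pos)
    if opening is None:
        return None
    # Record the delimiter seen in the input ...
    symbols.define("delim", opening.group(1))
    pos = opening.end()
    # ... and later match input against the recorded text.
    delim = symbols.latest("delim")
    while pos < len(src):
        eol = src.find("\n", pos)
        if eol == -1:
            eol = len(src)
        line = src[pos:eol]
        pos = min(eol + 1, len(src))
        if line == delim:
            return pos
    # Failure: undo the definition so no state leaks out of the failed match.
    symbols.rollback(mark)
    return None


table = SymbolTable()
text = "<<EOS\nhello\nEOT\nEOS\nrest"
end = match_heredoc(text, 0, table)
remainder = text[end:]
```

The point of the sketch is that the recognizer stores matched text and consults it later, without arbitrary user-written semantic actions; a pure PEG has no way to refer back to a delimiter that was chosen in the input itself.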


Similar resources

The design and implementation of Object Grammars

An Object Grammar is a variation on traditional BNF grammars, where the notation is extended to support declarative bidirectional mappings between text and object graphs. The two directions for interpreting Object Grammars are parsing and formatting. Parsing transforms text into an object graph by recognizing syntactic features and creating the corresponding object structure. In the reverse dir...


A Formulation of Deterministic Bottom-Up Parsing and Parser Generation In

This paper addresses efficient parsing in the context of logical inference for the purpose of using logic programming languages in compiler writing. A bottom-up, deterministic parsing mechanism is formulated for "bounded right context" grammars, a subclass of LR(k) grammars with characteristics amenable to declarative parser specification. A working parser generator for a logic programming langua...


Principled Parsing for Indentation-Sensitive Languages

Many languages, such as Haskell, Python, and F#, use the indentation and layout of code as part of their syntax. Because context-free grammars are not able to express these layout rules, existing parsers use ad hoc techniques to handle them. These techniques tend to be low-level and operational in nature, and thus forgo the advantages of more declarative specifications like context-free grammar...


Practical Dynamic Grammars for Dynamic Languages

Grammars for programming languages are traditionally specified statically. They are hard to compose and reuse due to ambiguities that inevitably arise. PetitParser combines ideas from scannerless parsing, parser combinators, parsing expression grammars and packrat parsers to model grammars and parsers as objects that can be reconfigured dynamically. Through examples and benchmarks we demonstrat...


sbp: A Scannerless Boolean Parser

Scannerless generalized parsing techniques allow parsers to be derived directly from unified, declarative specifications. Unfortunately, in order to uniquely parse (disambiguate) existing programming languages, extensions beyond the usual context-free formalism must be added to handle a number of specific cases. This paper describes sbp, a scannerless parser for boolean grammars, a superset of c...



Journal:
  • JIP

Volume 24

Publication date: 2016